Normalized Relevance Distance - A Stable Metric for Computing Semantic Relatedness over Reference Corpora
Abstract
We propose the Normalized Relevance Distance (NRD): a robust metric for computing semantic relatedness between terms. NRD makes use of a controlled reference corpus for a statistical analysis. The analysis is based on the relevance scores and joint occurrence of terms in documents. On the basis of established reference datasets, we demonstrate that NRD does not require sophisticated data tuning and is less dependent on the choice of the reference corpus than comparable approaches.
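The abstract does not state the NRD formula, so the following is only a minimal sketch of how a distance built from relevance scores and joint occurrence could be computed. It assumes a form analogous to the Normalized Google Distance, with document counts replaced by summed relevance scores (for example tf-idf weights scaled to [0, 1]). The function names `relevance_mass`, `joint_relevance_mass`, and `nrd_sketch`, and the dictionary-based corpus layout, are hypothetical and not taken from the paper.

```python
import math
from typing import Dict, List

# Hypothetical corpus layout: one dict per document, mapping a term to its
# relevance score in that document (assumed to lie in [0, 1]).
Corpus = List[Dict[str, float]]


def relevance_mass(term: str, docs: Corpus) -> float:
    """Sum of the term's relevance scores over all documents."""
    return sum(doc.get(term, 0.0) for doc in docs)


def joint_relevance_mass(x: str, y: str, docs: Corpus) -> float:
    """Sum over documents of the product of both terms' relevance scores."""
    return sum(doc.get(x, 0.0) * doc.get(y, 0.0) for doc in docs)


def nrd_sketch(x: str, y: str, docs: Corpus) -> float:
    """NGD-style distance over relevance scores (an assumed form, not the paper's)."""
    fx = relevance_mass(x, docs)
    fy = relevance_mass(y, docs)
    fxy = joint_relevance_mass(x, y, docs)
    if fx <= 0.0 or fy <= 0.0 or fxy <= 0.0:
        # The terms never co-occur (or one never occurs): treat as maximally distant.
        return math.inf
    log_fx, log_fy = math.log(fx), math.log(fy)
    denominator = math.log(len(docs)) - min(log_fx, log_fy)
    if denominator <= 0.0:
        # Both terms are fully relevant in every document.
        return 0.0
    return (max(log_fx, log_fy) - math.log(fxy)) / denominator


toy_corpus: Corpus = [
    {"car": 1.0, "engine": 1.0},
    {"car": 1.0},
    {"boat": 1.0},
    {"boat": 1.0},
]
print(nrd_sketch("car", "engine", toy_corpus))
print(nrd_sketch("car", "boat", toy_corpus))
```

With binary relevance scores, as in the toy corpus, this sketch reduces to the count-based Normalized Google Distance: "car" and "engine" evaluate to 0.5, while "car" and "boat" never co-occur and are treated as infinitely distant.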
Similar resources
Presentation of an efficient automatic short answer grading model based on combination of pseudo relevance feedback and semantic relatedness measures
Automatic short answer grading (ASAG) is the automated process of assessing answers based on natural language using computation methods and machine learning algorithms. Development of large-scale smart education systems on one hand and the importance of assessment as a key factor in the learning process and its confronted challenges, on the other hand, have significantly increased the need for ...
Computing Semantic Relatedness in German with Revised Information Content Metrics
The paper presents an application of information content based metrics to compute semantic relatedness of word senses in German. The main contributions are: an annotation study based on a revised definition of semantic relatedness beyond synonymy, an extension of Resnik’s (1995) procedure for computing information content of concepts for strongly inflected languages, an application of informati...
Czech Dataset for Semantic Similarity and Relatedness
This paper introduces a Czech dataset for semantic similarity and semantic relatedness. The dataset contains word pairs with hand annotated scores that indicate the semantic similarity and semantic relatedness of the words. The dataset contains 953 word pairs compiled from 9 different sources. It contains words and their contexts taken from real text corpora including extra examples when the wo...
Semantic Relatedness Estimation using the Layout Information of Wikipedia Articles
Computing the semantic relatedness between two words or phrases is an important problem in fields such as information retrieval and natural language processing. Explicit Semantic Analysis (ESA), a state-of-the-art approach to solve the problem uses word frequency to estimate relevance. Therefore, the relevance of words with low frequency cannot always be well estimated. To improve the relevance...
Publication date: 2014